ON INFERENCE FOR FRACTIONAL DIFFERENTIAL EQUATIONS 



ALEXANDRA CHRONOPOULOU AND SAMY TINDEL 



Abstract. Based on Malliaviii calculus tools and approximation results, we show how 
to compute a maximum likelihood type estimator for a rather general differential equa- 
tion driven by a fractional Brownian motion with Hurst parameter H > 1/2. Rates of 
convergence for the approximation task are provided, and numerical experiments show 
that our procedure leads to good results in terms of estimation. 



1. Introduction 

In this introduction, we first try to motivate our problem and outline our results. We 
also argue that only a part of the question can be dealt with in a single paper. We briefly 
sketch a possible program for the remaining tasks in a second part of the introduction. 

1.1. Motivations and outline of the results. The inference problem for diffusion pro- 
cesses is now a fairly well understood problem. In particular, during the last two decades, 
several advances have allowed to tackle the problem of inference based on discretely ob- 
served diffusions [T0| [36| HQ] , which is of special practical interest. 

More specifically, consider a family of stochastic differential equations of the form 

Yt = a+ !\{Ys-e)ds + y2 f a\Y,-e)dBl tG[0,r], (1) 

JQ Jo 

where a G M™, 9) : M™ — )• and cr(-; 9) : — M™-'"^ are smooth enough functions, 
5 is a d-dimensional Brownian motion and 6' is a parameter varying in a subset C M''. 
If one wishes to identify 9 from a set of discrete observations of Y, most of the methods 
which can be found in the literature are based on (or are closely linked to) the maximum 
likelihood principle. Indeed, if S is a Brownian motion and Y is observed at some equally 
distant instants ti = ir for i = 0, . . . ,n, then the log-likelihood of a sample {Yt-^, . . . , Yt^) 
can be expressed as 

n 

CW = 5^1n(p(r,F,,_„y,j^)), (2) 
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where p stands for the transition semi-group of the diffusion Y . If Y enjoys some ergodic 
properties, with invariant measure z/^^ under Pg^,, then we get 

a.s.- hm -C(^) = Eg,, b(r,Zi,Z2; 9)] ^ Je^O), (3) 

n— s>oo n 

where Zi ~ vq^^ and C{Z2 \ Zi) = p{t, Zi,- ; 9). Furthermore, it can be shown in a general 
context that 9 i— )■ Jeo{9) admits a maximum at 6* = 6'o. This opens the way to a MLE 
analysis which is similar to the one performed in the case of i.i.d observations, at least 
theoretically. 

However, in many interesting cases, the transition semi-group p is not amenable to 
explicit computations, and thus expression ([2]) has to be approximated in some sense. 
The most common approach, advocated for instance in [36], is based on a linearization of 
each p(r, y^- .^, y^.; 6'), which transforms it into a Gaussian density 

M {Y,^_, + r, aa*{Yu_,-9) r) . 

This linearization procedure is equivalent to the approximation of equation ([1]) by an 
Euler (first order) numerical scheme. Refinements of this procedure, based on Milstein 
type discretizations, are proposed in [TU] . 

Some special situations can be treated differently (and often more efficiently): for in- 
stance, in case of a constant diffusion coefficient, the continuous time likelihood can be 
computed explicitly by means of Girsanov's theorem. When the dimension of the driving 
Brownian motion B is d = 1, one can also apply Ito's formula in order to be back to 
an equation with constant diffusion coefficient, or use Doss-Sousman representation of 
solutions to ([1]). Let us also mention that statistical inference for SDKs driven by Levy 
processes is currently intensively investigated, with financial motivations in mind. 

The current article is concerned with the estimation problem for equations of the 
form ([1]), when the driving process B is a fractional Brownian motion. Let us recall 
that a fractional Brownian motion B with Hurst parameter H G (0, 1), defined on a com- 
plete probability space (f2, J-", P), is a d-dimensional centered Gaussian process. Its law 
is thus characterized by its covariance function, which is given by 

E [BlBi] = \ + s^" - \t - sr) t G M+. 

The variance of the increments of B is then given by 



E 



[Bl - Bl) 



s,tGM+, i = l,...,d, 



and this implies that almost surely the fBm paths are 7-Holder continuous for any j < H. 
Furthermore, for H = 1/2, fBm coincides with the usual Brownian motion, converting 
the family {B^; H G (0, 1)} into the most natural generalization of this classical process. 

In the last decade, some important advances have allowed to solve |33[ H3] and un- 
derstand [T9| IM] differential systems driven by fBm for H G (1/2, 1). The rough paths 
machinery also allows to handle fBm with H G (1/4, 1/2), as nicely explained in [HI [TH 
EZ1|29]. However, the irregular situation if G (1/4, 1/2) is not amenable to useful moments 
estimates for the solution F to ([T]) together with its Jacobian (that is the derivative with 
respect to the initial condition). This is why we concentrate, in the sequel, on the simpler 
case H > 1/2 for our estimation problem. In any case, many real world noisy systems are 
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currently modeled by equations like ([T]) driven by fBm, and this is particularly present in 
the Biophysics literature, as assessed by [25| [35], or for Finance oriented applications as 
in [131 [201 EB ESI 112] ■ This leads to a demand for rigorous estimation procedures for 
SDEs driven by fractional Brownian motion, which is the object of our paper. 

Concerns about the inference problem for fractional diffusion processes started a decade 
ago with the analysis of fractional Ornstein-Uhlenbeck processes [23] . Then a more recent 
representative set of references on the topic includes |371HT] . More specifically, |41] handles 
the case of a one-dimensional equation of the form 

Yt = a + e [ fi{Z)ds + B,, te[0,T], (4) 
Jo 

where /x is regular enough, and where i? is a fBm with H E (0,1). The simple dependence 
on the parameter 6 and the fact that an additive noise is considered enables the use of 
Girsanov's transform in order to get an exact expression for the MLE. Convergence of the 
estimator is then obtained through an extensive use of Malliavin calculus. 

As far as |37] is concerned, it is focused on the case of a polynomial equation, for 
which the exact moments of the solution can be computed. The estimator relies then 
on a generalization of the moment method, which tries to fit empirical moments of the 
solution with their theoretical value. The range of application of this method is however 
confined to specific situations, for the following reasons: 

• It assumes that independent runs of equation ([1]) can be obtained, which is 
usually not the case. 

• It hinges on multiple integrals computations, which are time consuming and are 
avoided in most numerical schemes. 

As can be seen from this brief review, parameter estimation for rough equations is still in 
its infancy. We shall also argue that it is a hard problem. 

Indeed, if one wishes to transpose the MLE methods used for diffusion processes to the 
fBm context, an equivalent of the log-likelihood functions should first be produced. 
But the covariance structure of B is quite complex and the attempts to put the law of Y 
defined by ([1]) into a semigroup setting are cumbersome, as illustrated by [H [T71 |3T| . We 
have thus decided to consider a highly simplified version of the log-likelihood. Namely, 
still assuming that Y is observed at a discrete set of instants < ti < ■ ■ ■ < t„ < C)0, set 

n 

Ue) = J2inUXt,,Y,^;e)), (5) 

i=l 

where we suppose that under Pg the random variable Yt- admits a density z i— )■ f{ti, z; 9). 
Notice that in case of an elliptic diffusion coefficient cr the density f{ti,-]6) is strictly 
positive, and thus expression ([5]) makes sense by a straightforward application of |1H 
Proposition 19.6]. However, the successful replication of the strategy implemented for 
Brownian diffusions (that we have tried to summarize above) relies on some highly non 
trivial questions: existence of an invariant measure for equation ([T]), rate of convergence 
to this invariant measure, convergence of expressions like , characterization of the limit 
in terms of 6* as in ([2]), to mention just a few. We shall come back to these considerations 
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in the next section, but let us insist at this point on the fact that all those questions would 
fit into a research program over several years. 

Our aim in this paper is in a sense simpler: we assume that quantities like ([5]) are 
meaningful for estimation purposes. Then we shall implement a method which enables to 
compute iniO) and optimize it in 6, producing thus a pseudo MLE estimator. We focused 
first on this specific aspect of the problem for the following reasons: 

(1) From a statistical point of view, it is obviously essential to obtain a computa- 
tionally efficient estimation procedure. This will allow us for instance to evaluate 
numerically the accuracy of our method. 

(2) The procedure itself is nontrivial, and requires the use of advanced stochastic 
analysis tools: probabilistic representation of the density, Malliavin type integra- 
tion by parts, Stratonovich-Skorohod correction terms, discretization of systems 
of pathwise stochastic differential equations for instance. 

We have thus decided to tackle the implementation issues first. If it turns out to be 
satisfying, we shall then try to proceed to a full justification of our method. 

Let us also mention that it might not be clear to the reader that in{d) can be meaningful 
in terms of statistical estimation, since it only involves evaluations at single points Yf^. 
However our numerical experiments indicate that this quantity behaves nicely for our 
purposes. Moreover, it will become clear from the forthcoming computations that our 
methodology can be extended to handle quantities like 

n 
i=l 

where f{s,t,x,z;6) stands for the density of the couple {Ys,Yt). This kind of pseudo 
log-likelihood is obviously closer in spirit to the diffusion case. Densities of tuples could 
also be considered at the price of technical complications. 

Let us now try to give a flavor of the kind of result we shall obtain in this article, in a 
very loose form: 

Theorem 1.1. Consider Equation ^ driven by a d- dimensional fractional Brownian 
motion B with Hurst parameter H > 1/2. Assume and a are smooth enough coefficients, 
and that aa* is strictly elliptic. For a sequence of times to < ■ ■ ■ < tn < oo, let yt-, 
i = 1, . . . ,n be the corresponding observations. Then: 

(i) The gradient of the log-likelihood function admits the following probabilistic represen- 
tation: Viln{0) = XlILi W7{e)' 



Wi{9) = E 



(6) 



where H(^j-^^___j^-^{Yt.{6)) is an expression involving Malliavin derivatives an Skorohod inte- 
grals ofY{9). A similar expression is also available for Vi{9). 

(ii) A computational procedure is constructed in order to obtain ... suit- 
able way. 

(iii) When Yt is replaced by its Euler scheme approximation with step T /M and expected 
values in ^ are approximated thanks to N Monte Carlo steps, we show that 
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• can be chosen in function of M in an optimal way (see Proposition^^. 

• The corresponding approximation ofViiniO) converges to the real one with rate 
,^-{27-1) fpr any 1/2 < 7 < iJ. 

All those results are stated in a more rigorous way in the remainder of the article. 

Here is how our article is structured: we give some preliminaries and notations on Young 
and Malliavin calculus for fractional Brownian motion at Section [2l The probabilistic 
representation for the log-likelihood is given at Section [3l Discretization procedures are 
designed at Section HJ and finally numerical examples are given at Section [5l 

1.2. Remaining open problems. We emphasized above the fact that only a part of 
the problem at stake was going to be solved in the current article. We now briefly sketch 
the remaining tasks to be treated. 

The most important obstacle in order to fully justify our methodology is to get a 
suitable convergence theorem for in{6)/n, where in{d) is defined by ([5]). In a natural way, 
this should be based on some strong ergodicity properties for Yt. After a glance at the 
literature on ergodicity for fractional systems, one can distinguish two cases: 

(i) When <t{-;6) is constant, the convergence of C{Yt) as t — >■ 00 is established in |15] . 
with a (presumably non optimal) rate of convergence 

(a) For a general smooth and elliptic coefficient a, only the uniqueness of the invariant 
measure is shown in |17] . with an interesting extension to the hypoelliptic case in |18] . 
Nothing is known about the convergence of C{Yt), not to mention rates. 

This brief review already indicates that the convergence to invariant measures is still quite 
mysterious for fractional differential equations, at least for a non constant coefficient 
a. Moreover, recall that if 1/(6) stands for the invariant measure corresponding to the 
system with coefficients fi{-;6),a{-;6), we also wish to retrieve some information on the 
dependence 6 u{6) (See [IS] for some partial results in this direction). 

Let us mention another concrete problem: even in the case of a constant a, the conver- 
gence of C{Yt) to an invariant measure i^(6') is proven in [15] in the total variation sense. 
In terms of the density p{t, x; 9) of 1^, it means that p(t, ■; 9) converges to the density of 
V in L} topology. However, in order to get a limit for £„(^)/n, one expects to use at least 
a convergence in some Sobolev space W""'^ for a,p large enough. 

One possibility in order to get this sharper convergence is to bound first the density 
p(t, ■]9) in another Sobolev space W"" and then to use interpolation theory. It seems 
thus sufficient to obtain Gaussian bounds on p{t, ■]9), uniformly in t. In case of Brownian 
diffusions, these Gaussian bounds are obtained by analytic tools, thanks to the Markov 
property. This method being obviously not available for systems driven by fBm, a possible 
inspiration is contained in the upper Gaussian bounds for the stochastic wave equation 
which can be found in [S]. The latter technical results stem from an intensive use of Malli- 
avin calculus, which should also be invoked in our case, and notice the recent efforts [21 [3] 
in this direction. 

Finally, let us mention that it seems possible to produce some reasonable convergent 
parametric estimators for equations driven by fBm in a rather general context. Among 
the methods which can be adapted from the diffusion case with the current stochastic 
analysis techniques, let us mention the least square estimator of |22], as well as the local 
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asymptotic normality property shown in jT2]. However, it seems obvious that the road to 
a complete picture of parameter estimation for stochastic equations driven by fBm is still 
hard and long. We hope to complete it in some subsequent communications. 



As mentioned in the introduction, we are concerned with equations driven by a d- 
dimensional fractional Brownian motion B. We recall here some basic facts about the 
way to solve those equations, and some Malliavin calculus tools which will be needed later 
on. Let us introduce first some general notation for Holder type spaces: 

Notation 2.1. We will denote by C^lV) the set of V -valued a- Holder functions for any 
a G (0,1), and by CJ^{U;V) the set of n times differentiable functions, bounded together 
with all their derivatives, from U to V . In the previous notation, U and V stand for two 
finite dimensional vector spaces. The state space V can be omitted for notational sake 
when its value is non ambiguous. When we want to stress the fact that we are working on 
a finite interval [0,T], we write C^{V) for the space of a- Holder functions f from [0,T] 
to V. The corresponding Holder norms shall be denoted by ||/||q,t- 

2.1. Differential equations driven by fBm. Recall that the equation we are interested 
in is of the form ([1]). Before stating the assumptions on our coefficients we need an 
additional notation: 

Notation 2.2. For n,p > 1, a function f E C^(]R"; M) and any tuple {ii,...ip) G 
{1, . . . , d}P, we set di,,,,i f for jr-^-^fT—- Similarly, consider a function gg G C(G; M.), for 
n^p > 1 and a vector of parameters 9 ^ Q C MS . For any tuple (zi, . . . Zp) G {1, . . . , g}^; 
we set Vi,,„i^gl for where i = l,...,n. 

Using this notation, we work under the following set of assumptions: 

Hypothesis 2.3. For any 6 e Q, we assume that 6) : W ^ and a{-; 6) -.R"" ^ 
^rn,d ^j,g ^2 coefficients. Furthermore, we have 

2 



When equation ([T]) is driven by a fBm with Hurst parameter H > 1/2 it can be 
solved, thanks to a fixed point argument, with the stochastic integral interpreted in the 
(pathwise) Young sense (see e.g. [S]). Let us recall that Young's integral can be defined 
in the following way: 

Proposition 2.4. Let f e C , g e with 'y + k > 1, and < s < t < 1. Then 
the integral g^ df^ is well-defined as limit of Riemann sums along partitions of [s,t]. 
Moreover, the following estimation is fulfilled: 



2. Preliminaries and notations 



supE E llvl....M-;^)lloo + l|v:^..,,a(-;^)|U < oo. 



1=0 l<iivi*i<9 




(7) 
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where the constant C only depends on 7 and k. A sharper estimate is also available: 



< \9s\ wfut - + c,,.ii/ii,ii^iuit - (8) 



With this definition in mind and under assumptions 12.31 we can solve our differential 
system of interest, and the following moments bounds are proven in [TT| [T9] : 

Proposition 2.5. Consider a fBm B with Hurst parameter H > 1/2. Then: 

(1) Under Hypothesis \ 2. 31 equation (Cjj driven by B admits a unique fi-Holder continuous 
solution Y , for any /3 < H. 

(2) Furthermore, 

\\y\\T,i3 < \a\ + c/,t||5||^(t- 

(3) If we denote by the solution to (Qp with initial condition a, then 

\\Y' - F"i|T,/3 <\h-a\ exp (c/,t||5|| J^?) . 

(4) If we only assume that f has linear growth, with VfjV^f bounded, the following 
estimate holds true: 



supte[o,T]\yt\ < (1 + exp (^c/,t||5|| J'^) • 



Remark 2.6. The framework of fractional integrals is used in jT^ in order to define integrals 
with respect to B. It is however easily seen to be equivalent to the Young setting we have 
chosen to work with. 

Some differential calculus rules for processes controlled by fBm will also be useful in 
the sequel: 

Proposition 2.7. Let B be a d- dimensional fBm with Hurst parameter H > 1/2. Con- 
sider a,d b,b ^ C^(R'^) with a + H > 1, and c,c & Cj'(R) (all these assumptions are 
understood in the almost sure sense). Define two processes z,z on [0,T] by 

Zt = a + / 6{ dBl + Cu du, and Zt = d + / 6{ dBl + Cu du. 
j^i Jo Jo Jo Jo 

Then for t G [0,T], one can decompose the product ZtZt into 

ZtZt = ad + y2 ZuVu + ZuK dBl + / [^u K + du, 
j^-^ Jo ^ Jo 

where all the integrals with respect to B are understood in the Young sense. 

The proof of this elementary and classical result is omitted here. See [2S1 Proposi- 
tion 2.8] for the proof of a similar rule. 
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2.2. Malliavin calculus techniques. Our representation of the density for the solution 
to ([1]) obviously relies on Malliavin calculus tools that we proceed now to recall. As already 
mentioned in the introduction, on a finite interval [0, T] and for some fixed H G (1/2, 1), we 
consider J-", P) the canonical probability space associated with a fractional Brownian 
motion with Hurst parameter H. That is, Q = Co([0, T]; M'') is the Banach space of 
continuous functions vanishing at equipped with the supremum norm, J-" is the Borel 
sigma-algebra and P is the unique probability measure on Q such that the canonical 
process B = {Bt, t G [0,T]} is a (i-dimensional fractional Brownian motion with Hurst 
parameter H . Remind that this means that B has d independent coordinates, each one 
being a centered Gaussian process with covariance Rnit, s) = + 1"^^ ~ 1^ ~ ^l"^^)- 

2.2.1. Functional spaces. Let S be the space of ci-dimensional elementary functions on 
[0,T]: 

rij — l 

S = {f = {h,...Jd); f^ = Y.<^lti,th.)^ = to<t><---<ii^_,<ti^=T, 

i=0 

forj = l,...,rf}. (9) 
We call T-L the completion of S with respect to the semi-inner product 

d 

{fi 9)n = ^(fi^ 9i)Ho^ where (l[o,t], l[o,.])wo — ^(^^ s,tG[0,T]. 

i=l 

Then, one constructs an isometry : H — )■ L^([0, 1]; M*^) such that 

(l[0,ti]5 • • • 5 l[0,y) = {'^[0,ti]^H(ti, ■),..., l[o^ta]KH{td, ■)) , 

where the kernel Kh is given by 

Knit, s) = chs^"" / ~ du 

and verifies that Rnit, s) = JJ*^* Knit, r)KH{s, r) dr, for some constant Ch- Moreover, let 
us observe that K"^ can be represented in the following form: for ip = [ipi, . . . , ipa) G "H, 
we have K'^ip 

K*Hip = {K*Hip\ K*hP>'') , where [KW]t = ^ p^AKuir, t) dr. 

2.2.2. Malliavin derivatives. Let us start by defining the Wiener integral with respect to 
B: for any element f in S whose expression is given as in ([9]), we define the Wiener 
integral of / with respect to B as 

d nj-l 
j=l i=0 

We also denote this integral as ftdBt, since it coincides with a pathwise integral with 
respect to B. 
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For ^ : M -T- M, and j E {1, . . . ,d}, denote by 9^^^ the function with values in M*^ having 
all the coordinates equal to zero except the j-th coordinate that equals to 6. It is readily 
seen that 



E 



5 (lb] \ Q -^[k] 



^j,kRs,t- 



This definition can be extended by linearity and closure to elements of H, and we obtain 
the relation 

E[B{f)B{g)] = {f,g)n. 
valid for any couple of elements f,g E H. In particular, B{-) defines an isometric map 
from "H into a subspace of L'^{Q). 

We can now proceed to the definition of Malliavin derivatives. With this notation 12.21 
in hand, let us consider S be the family of smooth functionals F of the form 

F = /(i?(/ii),...,5(/iO), (10) 

where hi,...,hn, G Ti, n > 1, and / is a smooth function with polynomial growth, 
together with all its derivatives. Then, the Malliavin derivative of such a functional F is 
the T^-valued random variable defined by 

n 

DF = Y,^^f{B{h), . . . ,B{K)) h,. 

For all p > 1, it is known that the operator D is closable from Lp{Q) into L^(f2; "H) (see 
e.g. Section 1]). We will still denote by D the closure of this operator, whose domain 
is usually denoted by D^'^ and is defined as the completion of S with respect to the norm 

\\Fh,p := {E{\Fn + E{\\DFr^))K 

It should also be noticed that partial Malliavin derivatives with respect to each component 
B^ of B will be invoked: they are defined, for a functional F of the form ( ITOl) and 
j = 1, . . . as 

n 

D^F = Y,dJ{B{h),. . . ,B{hn))h^f, 

i=l 

and then extended by closure arguments again. We refer to [321 Section 1] for the definition 
of higher derivatives and Sobolev spaces ©^'^ for k > 1. Another essential object related 
to those derivatives is the so-called Malliavin matrix of a M'^-valued random variable 
F e defined by 

jp= ((^DF\DF^^] . (11) 

V / l<jj'<m 

2.2.3. Skorohod integrals. We will denote by 6 the adjoint of the operator D (also referred 
to as the divergence operator). This operator is closed and its domain, denoted by Dom((5), 
is the set of valued square integrable random variables u G L^(f2; T-L) such that 

\E[{DF,u)n]\<C\\Fh, 

for all F G D^'^, where C is some constant depending on u. Moreover, for u G Dom(5), 
6{u) is the element of L^{Q) characterized by the duality relationship: 

E [FS{u)] = E [{DF, u)^] , for any F G P^'^. (12) 
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The quantity 6{u) is usually called Skorohod integral of the process u. 

Skorohod integrals are obviously analytic objects, not suitable for easy numerical im- 
plementations. However, they can be related to the Young type integrals introduced at 
Proposition 12.41 For this, we need to define another functional space as follows: 

Notation 2.8. We call {Til the space of measurable functions (f : [0,T] — > M'^ such that 



V?||^^l := / / |</?r| Iv^jil I?" — Mp'^ "^drdu < 
Jo Jo 



where Ch = H{2H — 1), and we denote by (■, ■)|-^| the associated inner product. We also 
write D^'^d'Hl) for the space 0/©^'^ functional with values in iTi]. 

The following proposition is then a slight extension of [32| Proposition 5.2.3]: 

Proposition 2.9. Let {u^ , t G [0, 1]}, for i = 1, . . . ,m and j = 1, . . . ,d, be a stochastic 
process in D^'^dHl) such that 

J2 \Dlui'\\t-s\''"^^dsdt< +00 a.s. (13) 

Jo Jo 

We also assume that almost surely, u has (5 -Holder paths with (5 + H > 1. Then the Young 
integral ^^=1 Ut dB{ exists and for all i = 1, . . . ,m can be written as 



j2 ruUBi=s{u^)+Y, r r Diu'^it-si'^-'dsdt, 

o-i JO ._i Jo Jo 



where 6{u) stands for the Skorohod integral of u. 

3. Probabilistic expression for the log-likelihood 

Recall that we are focusing on equation ([1]) driven by a d-dimensional fBm B, and that 
we have chosen to use expression ([5]) as a substitute to the log-likelihood function. We 
have thus reduced the initial maximization problem to the solution of V;£„(6') = 0. This 
will be performed numerically by means of a root approximation algorithm. 

Observe first that in order to define (l5l), the density of Yt{6) must exist for any t > 0. 
Let us thus recall the classical setting (given in [TH]) under which Yt admits a smooth 
density: 

Hypothesis 3.1. Let fi and a be coefficients satisfying Hypothesis \2JA For ^ G M™ and 
^ G 6, set a(^) = (t(^, ^)cr*(^, ^). Then we assume that 

(i) For any k > and ji, . . . ,jk G {1, . . . , m} we have 

2 

fPE E l|Vk..,,5l...,,,M-;^)lloo+ ||V;,,..,,a^,...,,a(-;e)|U < c., 

1=0 l<Pu...,Pi<q 

for a strictly positive constant Ck- 

(ii) There exists a strictly positive constant e such that {a{^] 6)7], r])M.m > e\ri\^m for any 
couple of vectors r],C, E M"\ uniformly in 6 E Q. 

Then the density result for Yt can be read as follows: 
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Theorem 3.2. Consider the stochastic differential equation (QP with initial condition 
a G M'". Assume Hypothesis \3.1\ is satisfied. Then, for any t > and 6* G O, the law of 
Yt{6) admits a C°° density, denoted by f{t, 6), with respect to Lebesgue's measure. 

In the sequel, we shaU suppose that the density /(t, ■; 6) exists without further mention, 
the aim of this section being to produce a probabihstic representation of /(t, ■; 6) for 
computational purposes. To this aim, we shall first give the equations governing the 
Malliavin derivatives of the processes Y{6) and VY{6), and then use a stochastic analysis 
formula in order to represent our log-likelihood. We separate these tasks in two different 
subsections. 



3.1. Some Malliavin derivatives. This section is devoted to a series of preliminary 
lemmas which will enable to formulate our probabilistic representation of /(t, ■; 6). Let 
us first introduce a notation which will prevail until the end of the paper: 

Notation 3.3. For a set of indices or coordinates (fci, . . . ,kr) of length r > 1 and 1 < 
j < r, we denote by {ki, . . . , kj, . . . , k,) the set of indices or coordinates of length r — 1 
where kj has been omitted. 

We now give a general expression for the higher order derivatives of Fj, borrowed 
from 



Lemma 3.4. Assume Hypothesis \2.'A and \3.1\ hold true. For n > 1 and (ii, . . . ,in) G 
{1, . . . , (i}"', denote by Z}*i'-'*"F/(6') the ri*^'' Malliavin derivative ofYliO) with respect to 
the coordinates B^^, . . . ,B^" of B. Then Z)'^'''*"y/(6'), considered as an element ofH®"", 
satisfies the following linear equation: for t > ri V ■ ■ ■ V 

n 



d rt 



+ [ ^^l,...,^M.rl:■■■:rn]e)ds + Y^ I al^_,Js;ri,...,rn,e)dBi, (14) 

JriV---Vr„ JriV---Vr„ 

where 

m 

ki,...,k„=l 
m 

fci,...,fc,y=i 

In the expressions above, the first sums are extended to the set of all partitions Ii, . . . ,1^, of 
{1, . . . , n} and for any subset K = {ii, . . . , i^} o/{l, . . . , n} we set for the derivative 

operator Dri'.'.'.'.'rv, ■ Notice that Dl}^'■■^'^^^Y^{9) = whenever t < ri V ■ • • V r„. 



The formulas above might seem intricate. The following example illustrate their use in 
a simple enough situation: 
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Example 3.5. The second order derivative Dlfj.^Y^^{9) can be computed as: 

JriVr2 Jri\/r2 

where 

m 

fc=i 

m 

fc=i 

and 

where we have used the convention of summation over repeated indices. 

Our formula for the log-hkehhood will also involve some derivatives of the process Y{6) 
with respect to the parameter 6. The existence of this derivative is assessed below: 

Proposition 3.6. Under the same hypothesis as for Lemma \3.4l the random variable 
Y^{d) is a smooth function of 6 for any t > 0. We denote by ViY^{6) the derivative of 
Y^{6) with respect to the l^^ element of the vector of parameters 9. This process satisfies 
the following SDE: 

ViYi{e) = [\di^i\Y^{e); e)ViY:{d) + Vifi'iYuiey, e)]du 

Jo 

+ E [\dcrHyu{e);e)ViY:ie) + Via^^iY^{ey,e)]dBi. 

Proof. The proof goes exactly along the same lines as for |34[ Proposition 4], and the 
details are left to the reader. 

□ 

We shall also need some equations governing the Malliavin derivatives of ViY{6). This 
is the aim of the following lemma: 

Lemma 3.7. For any I G {l,...,q} and n > 1, the process Vi-D*^''"'*"F(6') is n-times 
differentiable in the Malliavin calculus sense. Moreover, taking up the notations of Lem- 
ma \!^.4\ the process V/-D*^''"'*"y/(6') satisfies the following linear equation: fort > ri V 
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n 



, Tp, . . . , Tjj, 



p=l 



JriV---Vr„ ^_-|^ ^riV---Vr„ 



u;/iere = Vza^- ^^^...^^^ and = More specifically, /3];f^,...,i„ 

defined recursively by 

f^iu...,iA^',n,---,r^;0) 

m 

/lU-.-U/^ fci,...,fc^ = l 

p=i 

where we have set 

Notice that the same kind of equation (skipped here for sake of conciseness) holds true for 
the coefficients a-\ ■ . 

The next object we need for our calculations is the inverse of the Malliavin matrix 7yt(e) 
of Yt[ff)- Recall that according to (fTT]) . the Malliavin matrix of Yt[ff) is defined by 

■.= iYm = {{D.Y;ie),D.Y,\e)))^^^^^^^, (is) 

where we have set '-ft{0) := ^Yt{0) for notational sake in the computations below. We shall 
now compute 7r^(^) as the solution to a SDE: 

Proposition 3.8. The matrix valued process 'Ji'^{0) is the unique solution to the following 
linear equation in t]: 

vt{e) = a^\Yt{e);e)-y2 [\vu{0)&i{Yu{ey,e) + &J{Y^{ey,e)7]^]dBl 

1=1 

[n,{9)^{Y^{9y, 9) + 9)r]^{9)]du, (16) 



with 



aQ{Yt{9y d) = y2 / ^*^(^- d)(^''^iyr'iO); 9) \r - r'\'^"-^dr dr', i,i' = 1, . . . ,m 
Jo Jo 

and where the other coefficients a and (3 are defined by 
ai{Yu{9y, 9) = (dua^\Y^{9y 9)) and p{Yu{9y 9) = (9fc/(K(^); 9)) 

V / l<i' ,k<m V / l<i' ,k<m 

-(17) 
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Proof. The proof of this fact is an adaptation of |19[ Theorem 7] to the case of a SDE 
with drift. We include it here for sake of completeness, and we drop the dependence of Y 
on 6 for notational sake in the computations below. 

Let us start by invoking Proposition 12.71 and equation (HM in order to compute the 
product of two first-order Malliavin derivatives: 



DiY; Di,Yf = a'\Yr)a'''{Yr>) + 

m ( „t d 
k=l 1=1 



:i8) 



dBl 



dka^iYu) Dl,Y^ DlYt + dua'\Y^) DIY^ Dl,Y^ 

Moreover, recall that 7^ is defined by (fTSi) . Thus, the covariance matrix becomes 

j=i ^ ' ^ j=i 

Plugging f lTSj) into this relation, we end up with the following equation for 7**': 



7r = a 



d „t m 



I. 



k=l >- 



dkC7^\Y^)Y:' + dkCT'\Y^)Y: 



i'l f\/ \ „ik 



dBl 



t m 



WE 



k=l 



i k 



du. 



Using our notation f lT7|) and matrix product rules, we obtain that 7^ is solution to: 



7t 



/ {&i{Y^)iu + luaJ{Y^))dBl+ / C^{Y^)^u + lu~f{Y^))du. 
Jo Jo 



Consider now rj solution to ( TTO]) . Applying again Proposition \2.7\ it is readily checked 
that 7(?7( = Id for any t G [0, T], which ends the proof. 

□ 

Remark 3.9. Gathering equation f fT6|) and Proposition 12. 5[ it is easily seen that for any 
t > and 6 G 0, Yt{6) is a non degenerate random variable in the sense given at |32[ 
Definition 2.1.2]: we have det(7j~^) G L^^Q) for any p > 1. 

Now that we have derived an equation for 77 = 7"^, an equation for the Malliavin 
derivative of rj is also available: 

Proposition 3.10. For any I G and n > 1, the process rjt = •y^'^ is n-time 

differentiahle in the Malliavin calculus sense. Moreover, the process D^^'---^'^'^r]t satisfies the 
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following equation: for t > ri V • ■ ■ V 

n ki 

^ri,...,r„'lt \^) — 2-^ / J \'^rx,...,rk^i^Q -t>'ri,r2,...,ri.^_fc2 «o -t>'ri,...,r„_i.^ «0 ) 
kx=\ k2=l 

^—l ^riV...Vr„ JriV...Vr„ 

m. m 

At.,„(.;n,...,r„;^) = 5: J2 



and the same kind of equation holds for Cl\^ ^^{s;ri, . . . ,rn',0), with the coefficients (3 
replaced by ai. 

Proof. The proof of this proposition is based on Lemma 13.41 and the fact that ^^^^ — 



A-ldAx A~l 

dX ^x ■ 



dX 

□ 



Finally, one can also differentiate t] with respect to our standing parameter 6, which 
yields: 

Lemma 3.11. The derivative of the inverse of the Malliavin matrix rjt with respect to 9 
satisfies the following SDE 

VMO) = V^ao' - Yl / {^iVu{e)ae{Yu{e);e) + rj^{9)Vi[ae{Y^{9);9)] 

+Vi[aJ{Yu{e)- e)]T^^{e) + aJ{Yu{e)- e)ViVu{0)}dBi - [ {ViVu{e)^{Yu{ey, e) 

Jo 

+Vu{e)Vi[(3{Yu{9); 9)] + Vz[^^(F„(^); e)]riu{9) + ^^{Yu{9); 9)ViVu{9)}du, 
where Vi[^e{Yu)] = d^e{Y^)ViYu + VMY^) and V;[a,(K)] = dae{Y^)ViY^ + V,«,(r„). 

3.2. Probabilistic representation of the likelihood. We have chosen to represent the 
log-likelihood of our sample thanks to the following formula borrowed from the stochastic 
analysis literature: 

Proposition 3.12. Let F be a MJ^ -valued non degenerate random variable (see Re- 
mark \3.9[ for references on this concept), and let f be the density of F. For n > 1 
and G {!,..., m}", let be defined recursively by H(^j^-^{F) = 

j:T=iKilFy''DF^) ^nd 

m 

Hin,...,u)in = Y.^[{1fT' DF^H^n...,u-m) ^ (19) 
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where the Skorohod operator 6 is defined at Section \2. 2. 31 Then one can write 

fix) = E [l(^>,)i7(i,...,^)(F)] = E , (20) 

where 1(f>x) ■= YlT=i 1(f»>x,) and {F - x)+ := HI^il^' - 

Proof. The first formula is a direct application of [3^ Proposition 2.1.5]. The second one 
is obtained along the same lines, integrating by parts m additional times with respect to 
the first one. 

□ 

The formula above can obviously be applied to Yt{6) for any strictly positive t, since 
we have noticed at Remark 13.91 that Yt{6) is a non-degenerate random variable. However, 
the expression of H(^j^ j^-^{Yt{9)) given by (fT9l) is written in terms of Skorohod integrals, 
which are not amenable to numerical computations. We will thus recast this expression 
in terms of Young integrals plus some correction terms: 

Proposition 3.13. Under Hypothesis and {3Ji let us define Q^f := {-i-^YW%^ {9) 
for < s < t < T , p, j E {1, . . . , m} and i G {1, . . . ,d}. Consider p E {1, . . . , m} and a 

real valued random variable G which is smooth in the Malliavin calculus sense. Set 

m d „t m. d „t pt 

Up{G) = E E ^ / dBl-cuY^Y. / \GQ^ \r - s^^'drds, (21) 

i=l j=l -^0 i=l j=l -^0 

where the integral with respect to B is understood in the Young sense. Then the quantities 
^{ji,---dn)0^t{9)) defined at Proposition \3.12\ can be expressed as 

m 

Hin,...,u)iytm = J2u,„o...o U,, (F/(^)) . (22) 



Proof. It is an immediate consequence of Proposition 12. 9[ since we have noticed in our 
Remark 13.91 that Yt[9) is a non-degenerate random variable. 

□ 

The previous proposition is still not sufficient to warranty an effective computation 
of the log-likelihood. Indeed, the right hand side of f l^T]) contains terms of the form 
Dg [GQ^j*] , which should be given in a more explicit form. This is the content of our next 
proposition. 

Proposition 3.14. Set H(^j^^,„^j^^){Yt{9)) := Then the term Ds[Kj^,„j^Q^^i'] in m\) 

can be computed inductively as follows: 

(i) We have Ds[Kj^,,,j^Ql{'] = DsKj^,,,j^ Q^f + Kj^ j^ DsQ^t, and -D^Q^f ^■^ computed by 
invoking Proposition \3.1(^ for the derivative of 'j^^ and Lemma \3.4\ for the derivative of 
Yt{9). We are thus left with the computation of DgKj^ ^ j^. 

(ii) Assume now that we can compute n — r Malliavin derivatives of Kj^ j^. Notice that 
this condition is met for r = 0, since Yt{9) itself can be differentiated n times in an 
explicit way according to Lemma[J^ again. Then for any ji, . . . , jV+i and k < n — r — 1, 
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the quantity i^ji...jv+i can be differentiated k times, with a Malliavin derivative given by 

e=i j=i 

-CH f /*/^SL.P. QZ)\ri - r^r-'dr.dr,. (23) 
Jo Jo 

Proof. We focus on the induction step (ii), the other one being straightforward: for a 
smooth random variable W, one easily gets by induction that 

Di^::^s{w) = Di^:t:r!:wr, + ^(/^^--^w^). (24) 

Suppose we know the n — r Malliavin derivatives for f/j,, o ■ ■ ■ o Uj^{F) := Kj-^,,,j^. Recall 
moreover that 

Applying directly relation (12^ we thus get, for k < m — 1: 

k 



Our formula ( l23l) is now obtained by applying Proposition 12.91 to the Skorohod integral 
5{Dl...pSK,r...n g.)) above. 

□ 



Example 3.15. As an illustration of the proposition above, we compute U2 o Ui{F) for 
F = y^*, i G {1, . . . , m} and our (i-dimensional fBm B. 

Write first UiiY^) = 6(Y^ (7^^)"'^"'^ D^^Y^), and since this quantity has to be expressed 
in a suitable way for numerical approximations, we have 



t ft 



n=i ii=i-^o Jo 



\ui — 'U2I duidu2, 



where Q is defined at Proposition 13 . 1 31 and where the first integral in the right hand side is 
understood in the Young sense. In order to compute the second one, we have to compute 
Malliavin derivatives. This is done through Lemma [3^ for Y and Proposition 13. 101 for Q. 

We now have to differentiate Ui{Y^ ): the derivation rules for Skorohod integrals imme- 
diately yield 



2ij2 ^ 
t ) 
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Once again, the Skorohod integral above is not suitable for numerical approximations. 
Write thus 



i2=i i2=i 



'32 







and compute the Malliavin derivatives of the products YQ thanks to Lemma 13.41 for Y 
and Proposition 13.101 for Q. Once this is done, just write 

U2{u,{y;)) = 6{u,{y:)q^,) 

d d rt 

\U2 — Ui\^^~^dUidU2- 



32=1 ,2=1 



In order to give our formula for the derivative of the log-likelihood, we still need to 
compute the derivative with respect to 9 of H(^j-^ j^^{Yt{6)). For this we state the following 
lemma 

Lemma 3.16. The derivative with respect to 9 of Up(Yt{9)) can he written as 



3=1 

d rt rt 



Jo 



^^E/ / "^liDiYmQ'rnOW-sl'^-'drds, 
Jo Jo 



where ViY^{9) is computed according to Proposition \3.6[ and Vi[DiY^] is given by Lem- 
ma \3.'7[ As far as ^i[Qst{^)] ^■^ concerned, it is obtained through the following equation: 

Vimm = Viv^J{9)D,Y,\9) + r/f(^)V/[D,F/ 



t 



where the expression for V irfj {9) is a consequence of Lemma \3.11\ 

We are now ready to state our probabilistic expression for the log-likelihood function (|5]). 

Theorem 3.17. Assume Hypothesis \2.3\ and \3.1\ hold true. Let yt., i = 1, . . . ,n be the 

observation arriving at time ti. Let also Yt- be the solution to the SDE (QP at time ti. Then, 
the gradient of the log-likelihood function admits the following probabilistic representation: 



Wi{9) = E 



liYtje)>yt^) H(^i^,„^rn)(YtX9) 



(25) 
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and 



(26) 



where (i) H(^j^^ j^-^{Yt^{6)) is given recursively by f2E) and computed at Proposition \3.1 4 



(ii) ViYti{6) is given by Proposition \3.6\ (iii) Vi-ff(i,...,m,i,...,m) is obtained by applying Lem- 
ma VJ.lbX 

Proof. Recall that under Hypothesis 12.31 and Yt{9) admits a density /(t, ■; 9) for 
any t > and 6' G O. Moreover, we have defined ln{d) as in{0) = Yl^=i^^ifi'tiiyti', 
Thus 

Now Wi{9) can be expressed like (125|) by a direct application of (120|) . first relation. As far 
as Vt(^) is concerned, write 

f{t,,y,- ^) = E [{YtXe)-yt:)^H^,_rn^,...,rn){Yum] , 

according to the second relation in (120|) . By using standard arguments, one is allowed to 
differentiate this expression within the expectation, which directly yields (126|) . 

□ 

4. Discretization of the log-likelihood 

The expression of the log-likelihood that we derived in Proposition 13.171 is a fraction of 
two expectations that do not have explicit formulas even in the one-dimensional case. In 
addition, our goal is to find the root of this non-explicit expression, the ML estimator, 
which is an even harder task. To solve this problem in practice we first use a stochastic 
approximation algorithm in order to find the root of VilniO). In each iteration of the 
algorithm we compute the value of the expression using Monte-Carlo (MC) simulations. 
For each Monte-Carlo simulation, since we do not have available an exact way of simulating 
the kernels of the expectation, we use an Euler approximation scheme. More specifically, 
we simulate using Euler approximation terms such as Y^, DYt, which are solutions to 
fractional stochastic differential equations. 

Therefore, in our approach we have three types of error in the computation of the 
MLE: the error of the stochastic approximation algorithm, the Monte-Carlo error and the 
discretization bias introduced by the Euler approximation for the stochastic differential 
equations. Our aim here is to combine the Monte Carlo and Euler approximations in an 
optimal way in order to get a global error bound for the computation of Vi^„(6'). 

4.1. Pathwise convergence of the Euler scheme. The Euler scheme is the main 
source of error in our computations. There is always a trade-off between the number of 
Euler steps and the number of simulations, but what is usually computationally costly is 
the number of Euler steps. This is even worse when we deal with fractional SDEs, since 
the rate of convergence depends on H and the closer the value of H to 1/2, the more 
steps are required for the simulation. 
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In this section, we compute the magnitude of the discretization error we introduce. We 
measure the bias of the Euler scheme via the root mean square error. That is, we want to 
estimate the quantity sup^^^Q rp^(E\YT-{9) — where Yt{9) is the solution to the 

SDE ([I]) and F/^(6') is the Euler approximation of ¥^{6) given on the grid {t^; k < M} 
by 

d 

= y'^iO) + f^iY^fie)^ 0){rk^i -r,) + Yl ^'(K'^iO); (27) 



0, 



M - 1. Notice 



in which we denote 6B^^'^ = - and = ^ for k 

that those estimates can be found in [SI [TTl 120] • We include their proof here because it 
is simple enough, and also because they can be easily generalized to the case of a linear 
equation. This latter case is of special interest for us, since it corresponds to Malliavin 
derivatives, and is not included in the aforementioned references. 



Notation 4.1. For simplicity, in this section we write Y := Y{6). 

Proposition 4.2. Let T > and recall that Y^^ is defined by equation l[2l\) . Then, there 
exists a random variable C with finite moments such that for all < H and H > 1/2 
we have 



\Yt-Y\\^,T<CTM'-^^ 



(28) 



Consequently, we obtain that the MSB is of order 0{M 



Proof. In order to prove (128|) we apply techniques of the classical numerical analysis for 
the flow of an ordinary differential equation driven by a smooth path. Namely, the exact 
flow of ([T]) is given by $(i/; s, t) := F^, where Yt is the unique solution of ([T]) when t G [s, T] 
and the initial condition isYg = y. Introduce also the numerical flow 



d 



vl>(y; r,, r,+i) := y + /i(y)(r,+i - r,) + a^"(y)55J4^, 



(29) 



where Tk 



M ' 



M — 1. Thus, we can write that 



M 
-'^0 



0,. 



a. 



For q > k we also have that 

*(l/; n, Tq) := ^'(■; Tq-U rq) O ^'(■; rg_2, Tq^l) o 

The one-step error computes as 



Tk 



ds + 



Tk 



M 



'^{y\Tk,Tk+i] 



^(n) - cr(y) 



(30) 
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Furthermore, since Y ^ C and B E for 7 > 1/2, using ([7]) we have 



Tk 



< c^\\da\U\\Y\\^\\B\\^ 

< c^,. \\da\\^ \\B 



T 



M 



27 



T 



M 



27 



where we used the fact that Hl^H-y < Co-||-B||y''' (see Proposition 12. 5p . Similarly, for the 
drift part we have 



Tk+l 



ds 



< \\dfi\\oo \\y\ 



Therefore, the one-step error (130|) satisfies 

'-/t.cr II II 7 



T 



M 



7+1 



M 



7+1 



T 



M 



27 



(31) 



Now, we can write the classical decomposition of the error in terms of the exact and 
numerical flow. Since Y^'^ = $(F/'^; r^, r^) and Y^^ = ^{Y^^; Tq, r^) we have 



q-l 

Y^^ - = $(K„; To, n) - $(n,; r„ r,) = J] ($(F,f ; r,, r,) - <I>(K,^,; r,+i, r,)) . (32) 



fc=0 



Since $ (|y^f ; r^, r^) = $ ($(i;f ; r^, r^+i); Tfc+i, we obtain 



M 



= $($(F,f;r,,r,);r,+i,r,j -$(F^f^^;r,+i,r, 
< CHII5II7) |<f(nf;r,,r,+i)-n!ij, 



where we have used the fact that 

|$(a;t,s)-$(/3;t,s)<Cr(||i?||^)|a-/3|, 

where Ct is a subexponential function (see Proposition 12.51 again). Moreover, owing to 
relation (IM]) . 

27 

(33) 



l<^>(>^';^.,r,)-F^^J = |r, 
Therefore, replacing ( !33|) in ( !32|) for any q < n we obtain 



T 



9-1 



Mr, -fr,| ^ L^,(7 II-DII7 ^ 



fc=0 



T 

M 



27 
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Let US push forward this analysis to Holder type norms on the grid < ri < . . . < r„ = T. 

We have ioi q > p 



6 Y-Y 



M 



r„ r,) - j - (^(i;^; r„ r,) - Y^^ 



*(nf;r„r,)-<|.(F,^;r„r< 



M. , 



Similar to the calculations leading to f l33|) we obtain 



^(nf;r„r,)-$(nf;r„r. 
Moreover, owing to Proposition 12.51 part (2), observe that 



k=p 



T 

M 



27 



($(K^;rp,r,)-<|.(F4^;r„ 




- 








7 



Ml 



Consequently, we have that for < p < q < M 

q-l 



s{y-y'' 

which easily yields that 



<c\\\b\\\^'/''){y: 

k=p 



T 
M 



27 



k=0 



T 

M 



27 



d Y-Y 



M 



sup 

p,g=0,l,...,M-l,p^g 



< c{\\B\L) M^-27_ 



By "lifting" this error estimate to [0,T] and since |t — s| < T/M, 

\\Yt-Y\\^^^,T<C M^-^\ 



(34) 



which concludes the first part of the proof. 

Regarding the order of the Mean Square Error, it suffices to note that the constant C 
has finite moments. □ 

As mentioned before, an elaboration of Proposition 14.21 is needed in the sequel. Indeed, 
in the expression of the log-likelihood in Proposition 13.171 we need to discretize more 
complicated quantities of the underlying process, such as (fT4|) or (fT6|) . To this aim, let us 
notice first that all those equations can be written under the following generic form: 



Zt = a+ f euZudu+ f eu'ZtdBl 
Jo Jo 



(35) 
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where ^"^ are stochastic processes with bounded moments of any order. The corre- 
sponding Euler discretization is 

C = C + C.^-ir,,, - n) + tC;^Z^, SB--^^, (36) 

and we give first an approximation result in this general context: 

Proposition 4.3. Let T > 0, and consider the W^-valued solution Z to equation (E^, 
where a G W, G M^'^ and we suppose that H^^H^ and H^^'-'H^ belong to U'{Q) for 

any value of p > 1. Let be defined by equation (E^. Then, there exists a random 
variable C with L^ finite moments, such that for all •y < H and H > 1/2 we have 



\Z-Z\ 



(37) 



Consequently, we obtain that the Mean Square Error is of order 0{M 



1-27N 



Proof. We follow a similar approach as in the previous proposition. Thus, the exact flow 
is equal to $((^; s, t) := Zt, where Zt is the unique solution of equation (135|) when t G [s, T] 
and the initial condition is Zg = (■ Consider also the numerical flow 



where = kT/M, n = 0, . . . , M — 1. Thus, we have 



7M 



a. 



M 



In this case, the one-step error can be written as 



$(C; n, Tk+i) - ^iC n, Tk+i] 



'^k + l 



We now treat each term separately. Therefore, using the fact that < exp(c||i?||y''^), 

which is recalled at Proposition 12.51 point (4) in a slightly different context, we have that 

27 



/ il{Zs 



QdB, 



< \\Zi% \\B\ 



T 



< 



exp( 



Similarly, we also have 



^siZ, - C)ds 



Tk 



< \\zi^ 



M 
\B\ 

T 



T 



M 
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exp(||i?||;/^) 



M 
\B 



27 



T 



M 
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Therefore, the one-step error satisfies the foUowing inequahty 



|rfc| <c^exp(||B||y^) \\B\ 



M 



27 



Afong the same fines as for Proposition 14. 2[ tfie decomposition of tfie error in terms of 
tfie exact and numerical fiow becomes 



k=0 



and tfie same inequafities aUowing to go from ( 15^ to fl33p yield 



T 



M 



27 



Tfie claim of tfie proposition follows now as in Proposition 14.2 



□ 



We now use tfie previous proposition in order to approximate tfie kernels of tfie expec- 
tations in V;£„(^). Let us first introduce tfie following notation: 

Notation 4.4. Let Wi{9), Vi{9) as in ^25\) and respectively and define Wi{9) and 
Vi{9) as 

wm = l^y^^^o)>n^) H^,_^)(Y,Se)) (38) 

v,{e) = ViYt^ie) l(y,je)>j;0^(i,.,"M,.,-)+ (^t. W-yt,)^V,i/(i,.,™,i,.,^). (39) 

Let also wf^ and vf^ to be the Euler discretized versions of (E^j and (3^) using \4.3\ and 
set Wj^{9) = E[M;f ] and V^\9) = E[t;f ]. 



Our convergence result for Vi£„(6') can be read as follows: 
Theorem 4.5. Recall from Theorem \3 . 11\ that Viin{d) can be decomposed as Viin{(^) 



Yl^=i W7e)' ^^^'^ following approximation result holds true: 



Ml 



for a strictly positive constant c. 

Proof. We focus on tfie bound for \Vi{6) — V/^{6)\, tfie otfier one being very similar. 
Now, applying Proposition l4.3l to tfie particular case of tfie equations governing Malliavin 
derivatives, we easily get 

\\vt - vl,T < C2M^-^\ 

for an integrable random variable 6*2- Tfie proof is now easily finisfied by invoking tfie 
inequality 

\vm-v,''{e)\<mvt-v\\,,T\- 

□ 
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Remark 4.6. We have given two separate approximations for Vi{6) and Wi{6). In order 
to fully estimate {Vi{9)/Wi{9)) - {V/^^ {9)/Wj^{9)), one should also prove that Wi{9) is 
bounded away from 0. This requires a lower bound for densities of differential equations 
driven by fractional Broawnian motion, which are out of the scope of the current article. 

4.2. Efficiency of the Monte Carlo simulation. In this section we aim to study the 
computational tradeoff between the length of a time period in the Euler discretization 
(i.e. 1/M) and the number of Monte Carlo simulations of the sample path (i.e. A^). In 
order to do so we consider wf'^ and vf '^ as above. 

Recall that, given t units of computer time, the Monte-Carlo estimators for Wi{9) and 
Vi{9) can be written as 

cAt ^) ^ Co(t ^ '''^ 

^iK^^m) k=i ^^\^^m) k=i 

where i > 1} (resp. {vf j; £ > 1}) is a sequence of i.i.d. copies of wf^ (resp. of v^ ^), 

and Ci{t, jj),C2(t, -p) are the maximal number of runs one is allowed to consider with t 
units of computer time. Using the result by (TU] we can state the following proposition: 

Proposition 4.7. Let N be the number of Monte Carlo simulations and M the number 
of steps of the Euler scheme, then the tradeoff between N and M for computing Wi{9) 
(and similarly Vi{9) ) is 

N X M^"^ 

for all 1/2 < '-f < H and 7 = Tm{d + 1), where T is the time horizon, m the dimension 
of the observed process and d the dimension of the noise process. 

Proof. We discuss the proof only for Wi, by following exactly the same steps we can obtain 
the same result for Vi. 

We only need to check that our process w satisfies the conditions of Theorem 1 in |10| . 

(i) We can easily see that the discretized wfj converges uniformly to Wt^. 

(ii) In addition, we have bounded moments of w^., thus E[Wj^] — E[w^,]. 

(iii) From Proposition 14.51 we have that the rate of convergence of the Euler scheme of 

is M^-^T, for 1/2 < 7 < H. 

(iv) The computer time required to generate wf^ is given by r(l/M), which satisfies: 

r(l/M) = Tm{d + 1)M = 7M 

where T is the length of the time period, m is the dimension of the SDE, d is the 
dimension of the fBm and M is the number of Euler steps. 

By applying Theorem 1 (by jlO]) the optimal rule for choosing the number of Monte- 
Carlo simulations and the number of Euler steps is chosen such that the asymptotic error 
is minimized. Therefore, for t the total budget of computer time, as t increases, then the 
Euler step should converge to zero with order -^2-47 '-'^ equivalently: 

1 1—27 7+2-47 
— X 1 7+2-47 thus t X M 1-27 . 
M 

But the number of operations needed for an arbitrary Monte Carlo simulation is equal 
to ^MN. Thus, we finally obtain that N x M"^W^-\ □ 
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4.3. Discretization of the score function. Consider the foUowing discretized version 
of the score function, i.e. Vilnid)'- 



^MO) = ^:= f^r-M ' (40) 



where wf^, w^i^, . . . and w j^, u^, . . . are iid copies of wf'^ and v^^ respectively. Our aim in 
this section is to give a global bound for the mean square error obtained by approximating 

Viinie) by Viinie). 

Proposition 4.8. The discretized score function Viin{(^) converges to the continuous 
score function V;£„(^) with rate of convergence of order M~^'^'^~^\ where 1/2 < 'j < H 
and M is the number of Euler steps used in the discretization. 

Proof. We discuss the idea of the proof for the Wi term first: 



2 



N ^2 



^{W,-W,] = E|->>g-E[m 



\ fe=i 

/I ^ 1 ^ 1 ^ 

^ fc=l k=l k=l 



Thanks now to the independence property between Monte Carlo runs, we get 

k=l ^ k=l ' 

1 ^ 1 

= — ^ (Euler MSE)^ + (Monte Carlo MSE)^ x {M^-'^^f + — , 
fc=i 

and thus 



MSE (W, - VT.) X W (Mi-27)2 + i_. 



_ 7+2—47 

Now, if we use Proposition 14. 7[ i.e. x M 1-27 , for all 1/2 < 7 < if, and 
7 = Tmid +1), where T is the time horizon, m the dimension of the observed process 
and d the dimension of the noise process, we have 



MSE (Wi - VTi) X V M2-47 + MT^+^ X 

since the first is the dominant term above. 

Following the same procedure, we can show that MSE(yi - x M^-^t and thus the 
claim of the proposition follows easily. 

□ 

Remark 4.9. In Proposition 14.81 the rate of convergence is independent of the dimension 
of the problem, i.e. it is independent of the parameter 7 = Tm{d + 1). 
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5. Numerical Examples 

In this section our aim is to investigate the performance of the suggested maximum hke- 
hhood method in practice. We study the one-dimensional fractional Ornstein-Uhlenbeck 
process, a linear two-dimensional system of fractional SDEs and then some real data given 
by a financial time series. Before presenting our results, we first discuss some technical 
issues raised by the algorithmic implementation of our method. 

The goal is to find the root of the quantity Vi£n(6') with respect to 9. We can divide this 
procedure in two parts. The first part consists in computing the root of the log-likelihood 
using a stochastic approximation algorithm. This is a stochastic optimization technique 
firstly introduced by Robbins and Monro (1951) that is used when only noisy observations 
of the function are available. In our case it is appropriate, since we want to solve 

ViUO) = 0, 

where V;£n(^) is given by Theorem 13.171 and has to be approximated by Vi^„(6'). Thus, 
the recursive procedure is of the following form 

4+i = 4-afcWn(^fc). (41) 

where V;£„ is the estimate of Vj£„ at the k-th iteration based on the observations and 
Cfc is a sequence of real numbers such that Xlfc^i = oo and Xlfc^i ^\ < Under 
appropriate conditions (see for example |1]), the iteration in (HT]) converges to 9 almost 
surely. The step sizes satisfy Ofc > and the way that we choose them can be found 
in [2^]. 

The second part consists of the computation of V;£„(^fc) at each step of the stochas- 
tic approximation algorithm. Thus, for a given value of 9k (the one computed at the 
k-i\i iteration) we want to compute Viini9k) when we are given n discrete observations 
of the process: yt^, i = 1, . . . ,n. Here, we describe the main idea of the algorithm we 
use for only one step. Thus, assume that we are at and at time tj we obtain 

the i-th observation. We want to compute Wi{9) and Vi{9) according to expressions (I25p 
and f l26|) respectively. To compute the expectations we use simple Monte-Carlo simula- 
tions.Therefore, we discretize the time interval into steps 

U-i = So < Si < ■ ■ ■ < Sn = U. 

From each simulated path (apart from that of fBni) we only need to keep the terminal 
value which is the value of the process at time tj. The algorithm is the following 

(1) Simulate values of fBm in the interval using for example the circulant 
matrix method (any exact -preferably- simulation technique can be used). 

(2) Using the simulated values from step [1] and an Euler scheme for the SDE (1), 
simulate the value of the process at time tj. For example, for k = 0, . . . , N 

(3) Using step [2] and the observation at time tj, compute the indicator function 



28 



A. CHRONOPOULOU AND S. TINDEL 



(4) Using step[T]and an Euler scheme simulate DiY^, as given in Lemma 3.3 for n = 1 
-first Malliavin derivative-. 

(5) Using step [Hand an Euler scheme simulate rj^-' , k, j = 1, . . . ,m, as given in Propo- 
sition 3.7. 

(6) Steps m and [5] are used to compute Q^J., p G {1, . . . ,Tn}, j G {1,. . . ,d} as defined 
in Propositions 13. 12] and [31141 

(7) Simulate the Malliavin derivative of the product Ds[YtQ^-j-]. 

(8) Using the previous steps, numerical integration for the double integral and nu- 
merical integration for the stochastic integral we compute Up{Yt.{6)) as defined in 
Proposition 13.121 

(9) Recursively compute H(^i^, ^^m^{YtXO)) as given in (19). 

(10) Combine steps [3] and [9] to obtain the kernel Wi{6). 

(11) We repeat steps [T] through [TO] times and we average these values to obtain an 
estimate for the expectation Wi{6). 

Using a similar procedure we can obtain an estimate for the expectation Vi{6). Finally, 
for each z = 1, . . . , n we compute Vi{6)/Wi{6) and sum over i to obtain the desired value 
of the log- likelihood at 6k- 

We have completed the study of our numerical approximation of the log-likelihood, and 
are now ready for the analysis of some numerical examples. 

5.1. Fractional Ornstein-Uhlenbeck process. Though our method can be applied to 
highly nonlinear contexts, we focus here on some linear situations, which allow easier 
comparisons with existing methods or exact computations. Let us first study the one- 
dimensional fractional Ornstein-Uhlenbeck process, i.e. 

dYt = -XYtdt + dBt, (42) 

where the solution is given Yt{X) = e~'^^*~^^dBs (notice the existence of an explicit 
solution here). In this case our methodology is quite simplified. The log- likelihood can 
be written as follows: 



i=l 



E 


d,Yt{\) l(y,(,)>,)iJ(i,i)(A) + 


(Ft(A)-y)^aAi/(i,i)(A) 


E 


^{ytW>y) ^{1)( 


Vi(A),l) 





The Malliavin derivative of Yt{X) satisfies the following ODE 

D,Yt{X) = l-\ D.YMdu, 
with solution DsYt{X) = e~^^ l{s<t}- The corresponding norm is 

||D.y;(A)f = ch f Te-^^^+^'V - v\''''-^dudv. 

J s J s 

The higher order derivatives of ^^(A) are equal to zero. Therefore, 

*^ V \\D.Yt{X)f Js 
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and thus 



i/(i,i)(A) = / / e-^("+'')rfi?„di?.-c^p.F,(A)|r 

ll-f^- J'^U'^J II Js 

The derivative with respect to the unknown parameter A satisfies 



with solution d\Yt{\) = j^{t — s)e ^^dBg. The last term we need to compute is: 

1 



5Ai^(i,i)(A) 



t rt 



WD.YtiXW 
-2cH\\D.YtiX)f 



J s J r 

[u + tOe-^^^+'^V - v\^^-'^dudv 
^^-'^dudv 



u — v\ 



\\D.Y,{XW 

Now, we compute the MLE following the algorithm we described above. The results 
we obtained are summarized in the following table: 



True A 


MLE A 


Standard Error 


0.5 

4 


0.497 
3.861 


0.00369 
0.00127 



Remark 5.1. The value of H used for the simulation of the process is 0.6. The number of 
observations is n = 50, the number of Euler steps is M = 500, the number of stochastic 
approximation steps is K = 50 and the number of MC simulations = 500. 

5.2. Two-dimensional fractional SDE. In this section we study the following system 
of fractional OU processes: 



dY}^^ = -l3Yi^'^dt + l3dBf\ 



(43) 



In this case, the computations are more involved even though the SDEs are linear functions 
of Y . Furthermore, the parameter we want to estimate is two-dimensional as well {9 = 
(a, which complicated the optimization procedure. Therefore, instead of computing 
only one derivative, we need to compute both derivatives with respect to a and (3 and 
then compute the solution of the system of two equations 



where 



i=l 

X {E[ViYt{a,f3) l(yt(Q:,/3)>y) -f^(l,2,l,2)(a, P) + P) - y) + V/i/(i,2,l,2)(tt, /?)]} 
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and I = a oi (3. The Malliavin derivative of Yt computes as follows: 



DsY, 



(1) 



[i — a 



DsY, 



(2) 



The covariance matrix 7^ is given by {{D.Y^ , D.Y^))i<ij<2- The inverse of the covariance 
matrix satisfies the following SDE 



It 



-1 



where 



M 



a 

(3 



Now, it remains to compute the quantities -ff(i,2) and i^(i,2,i,2)- This can be done using 
the recursive formulas in Proposition 3.12, but we need to keep in mind that higher order 
derivatives of Y are equal to zero, thus they will be simplified. Indeed, 
2 



|2_H'-2 



drds 



H(i) (Yt) = Y.yJ {^:'yW,Y^dBi -CH j j D.Y^Q,, 
Jo Jo Jo 

Moreover, we can easily see that 

H^,^2){Yt) = H^i){Yt) f QstdB,-CH f f D,H^^){Yt)Qr 
Jo Jo Jo 

Hii,2,i,2){Yt) = H^i,2,i){Yt) ! QstdBs-CH [ [ DsH^,,2,i){Yt)Q 

Jo Jo Jo 

Of course, recall that Q^J = {'-f'^y^DgY^. In practice, these quantities are computed re- 
cursively. The last step is to compute the derivative of -ff(i,2,i,2)(^t) with respect to a and 
(3, which in this case is not as complicated and compute the MLEs using the algorithm 
discussed in the previous section. The table below summarizes our results, and we have 
plotted the corresponding histograms in Figure [H 



rtr 



\2H-2 



drds 



Parameter 


True Value 


MLE 


Standard Error 


a 


2 


2.003 


0.0518 


/3 


4 


3.987 


0.0157 



Remark 5.2. The value of H used for the simulation of the process is 0.6. The number of 
observations is ?i = 50, the number of Euler steps is TV = 500, the number of stochastic 
approximation steps is K = 50 and the number of MC simulations M = 500. 

5.3. Application to financial data. One of the most popular applications of fractional 
SDEs is in finance. Hu and Oksendal, [20], introduced the fractional Black-Scholes model 
in order to account for inconsistencies of the existing models in practice. More specifically, 
the stock price is described therein by a fractional geometric Brownian motion with Hurst 
parameter 1/2 < < 1. The choice of this model is based on empirical studies that 
displayed the presence of long-range dependence on stock prices, for example in Willinger, 
Taqqu and Teverovsky, 



ON INFERENCE FOR FRACTIONAL DIFFERENTIAL EQUATIONS 

Histogram of alpha Histogram of beta 



31 



1.90 1.95 2.00 2.05 2.10 



3.80 3.85 3.90 3.95 4.00 4.05 4.10 4.15 



alpha 

Drift Parameter. 



Diffusion Parameter. 



Figure 1. Empirical Distribution of the estimators for a and /3. 

However, the presence of fractional Brownian motion in the model allows for arbitrage 
in the general setting. It has been shown that arbitrage opportunities can be avoided in a 
number of ways, for example the reader can refer to Rogers |39], Dasgupta and Kallianpur 
and Cheridito |5]. We choose to model the stock price as as follows: 

dSt = fiStdt + adBt, (44) 

where 5 is a fractional Brownian motion with Hurst index 1/2 < if < 1. For this SDE 
(as well as for a more general class of fractional SDEs) Guasoni, [13], proved that there 
is no arbitrage when transaction costs are present. 

Our goal is to estimate the unknown parameters n and a based on daily observations of 
the S&P 500 index (data from June 2010 until December 2010). Since the Hurst parameter 
is piece-wise constant, we devide the data in three groups (of 50 daily observations each) 
and we compute for each one the Hurst index using the Rescaled-Range (R/S) statistic. 
We obtain that for the first group of data Hi = 0.59, for the second H2 = 0.63 and for 
the third one = 0.61. For the different groups, we apply our maximum likelihood 
approach in order to estimate fi and a. The estimates are summarized in the following 
table: 



Estim. Parameters 


Group 1: Hi = 0.59 


Group 2: H2 = 0.63 


Group 3: H3 = 0.61 


A 
a 


0.015 (0.0123) 
0.352 (0.058) 


0.019 (0.0144) 
0.339 (0.046) 


0.011 (0.0214) 
0.341 (0.024) 



Remark 5.3. The volatility is computed in years. In addition, during this period of time 
the historical volatility is around 0.38, which is coherent with our own estimation. 
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